Have arrays' drop_glue just unsize and call the slice version#155184
Conversation
|
|
|
|
||
| // RAW: ; core::ptr::drop_glue::<[array_drop_glue::NeedsDrop; [[N:7|13|42]]]> | ||
| // RAW-NEXT: inlinehint | ||
| // RAW: call core::ptr::drop_glue::<[array_drop_glue::NeedsDrop]> |
There was a problem hiding this comment.
Compare nightly https://rust.godbolt.org/z/5Wv1q86ja with a loop in each of the three.
This comment has been minimized.
This comment has been minimized.
|
💭 does it make sense to keep the array case separate for |
|
My understanding is that LLVM will very effectively inline those -- just like how we don't special case the array iterator for N ≤ 1 and still use the polymorphic version. Does make me ponder if we should have a separate general "hey, there's no |
d1bdee8 to
2ae092a
Compare
This comment has been minimized.
This comment has been minimized.
8a17db9 to
bf354cd
Compare
1f28062 to
b0020c9
Compare
|
This PR was rebased onto a different main commit. Here's a range-diff highlighting what actually changed. Rebasing is a normal part of keeping PRs up to date, so no action is needed—this note is just to help reviewers. |
|
@bors try @rust-timer queue |
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
Have arrays' `drop_glue` just unsize and call the slice version
This comment has been minimized.
This comment has been minimized.
|
Finished benchmarking commit (ec33dfc): comparison URL. Overall result: ✅ improvements - no action neededBenchmarking means the PR may be perf-sensitive. Consider adding rollup=never if this change is not fit for rolling up. @rustbot label: -S-waiting-on-perf -perf-regression Instruction countOur most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.
Max RSS (memory usage)Results (primary -5.2%)A less reliable metric. May be of interest, but not used to determine the overall result above.
CyclesResults (secondary -2.8%)A less reliable metric. May be of interest, but not used to determine the overall result above.
Binary sizeResults (primary 0.3%)A less reliable metric. May be of interest, but not used to determine the overall result above.
Bootstrap: 497.359s -> 500.627s (0.66%) |
| start.terminator = Some(Terminator { | ||
| source_info, | ||
| kind: TerminatorKind::Call { | ||
| func: Operand::function_handle(tcx, def_id, [ty::GenericArg::from(slice_ty)], span), |
There was a problem hiding this comment.
(question) What's the def_id a def id of?
There was a problem hiding this comment.
It's the def id of the drop_glue itself. Basically the same as require_lang_item(LangItem::DropGlue), but it was already passed in to generate the drop glue we're currently making, so I didn't need to look it up -- the existing parameter on https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir_transform/shim/fn.build_drop_shim.html.
| // fragility from that changing we don't check any particular order. | ||
|
|
||
| // RAW: ; core::ptr::drop_glue::<[array_drop_glue::NeedsDrop; [[N:7|13|42]]]> | ||
| // RAW-NOT: inlinehint |
There was a problem hiding this comment.
(question) Why are you checking for lack of inlinehint here?
There was a problem hiding this comment.
Oh, because I'd not un-pushed the second commit where I experimented with not touching the inlining check.
I went back to the original which I think got overall-better perf results where the delegating ones are inlinehint but the one with the actual loop(s) isn't, and thus these tests check that.
|
|
||
| // CHECK-LABEL: ; core::ptr::drop_glue::<[array_drop_glue::NeedsDrop]> | ||
| // CHECK-NOT: inlinehint | ||
| // OPT: add nuw nsw {{.+}} 1 |
There was a problem hiding this comment.
What is this checking for?
There was a problem hiding this comment.
That it's the actual loop which got optimized to add nuw nsw despite just being Add in the code generating the loop.
b0020c9 to
05f52c7
Compare
|
@bors r=WaffleLapkin |
|
@bors rollup |
…, r=WaffleLapkin Have arrays' `drop_glue` just unsize and call the slice version It's silly to emit two loops (because of the drop ladder -- just one in panic=abort) for every array length that's dropped when we can just polymorphize to the slice version. Built atop rust-lang#154327 to avoid conflicts later, so draft for now. r? @WaffleLapkin
…uwer Rollup of 11 pull requests Successful merges: - #156448 (miri subtree update) - #155023 (Introduce move expressions (`move($expr)`) ) - #156429 (Simplify `intrinsic::raw_eq` in MIR when possible) - #147672 (LLBC-linker: Do not strip debug symbols for the nvptx target anymore) - #155169 (jsondoclint: simplify code using idiomatic Rust) - #155184 ( Have arrays' `drop_glue` just unsize and call the slice version) - #156022 (rustdoc: Fix cosmetic issues when reporting unresolved paths in `broken_intra_doc_links`) - #156442 (Show intrinsics::gpu in docs) - #156461 (LLVM 23: Specify `returnaddress` intrinsic return type) - #156462 (LLVM 23: Accept float (instead of hex) literals in codegen tests) - #156466 (Refactor `CheckAttrVisitor` so rustfmt can format it.)
View all comments
It's silly to emit two loops (because of the drop ladder -- just one in panic=abort) for every array length that's dropped when we can just polymorphize to the slice version.
Built atop #154327 to avoid conflicts later, so draft for now.
r? @WaffleLapkin